Connectionist speaker normalization and adaptation
Authors
Abstract
In speaker-independent, large-vocabulary continuous speech recognition systems, recognition accuracy varies considerably from speaker to speaker, and performance may be significantly degraded for outlier speakers such as nonnative talkers. In this paper, we explore supervised speaker adaptation and normalization in the MLP component of a hybrid hidden Markov model/multilayer perceptron version of SRI's DECIPHER™ speech recognition system. Normalization is implemented through an additional transformation network that preprocesses the cepstral input to the MLP. Adaptation is accomplished through incremental retraining of the MLP weights on adaptation data. Our approach combines both adaptation and normalization in a single, consistent manner, works with limited adaptation data, and is text-independent. We show significant improvement in recognition accuracy.
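The normalization idea in the abstract can be illustrated with a minimal sketch. It assumes a linear transformation network initialized to the identity and trained by gradient descent while a stand-in MLP stays frozen. The layer sizes, learning rate, random weights, and synthetic "speaker" data are all hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, HIDDEN, CLASSES = 6, 8, 3  # hypothetical sizes, not from the paper

# Stand-in for a trained speaker-independent MLP; its weights stay frozen.
W1 = rng.normal(0.0, 0.5, (DIM, HIDDEN))
W2 = rng.normal(0.0, 0.5, (HIDDEN, CLASSES))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, A, b):
    """Apply the linear transformation network, then the frozen MLP."""
    xn = x @ A + b                # normalized input features
    h = np.tanh(xn @ W1)          # hidden layer
    return xn, h, softmax(h @ W2)

def cross_entropy(p, y):
    return float(-np.log(p[np.arange(len(y)), y] + 1e-12).mean())

def adapt_transform(x, y, steps=300, lr=0.01):
    """Train only the transformation (A, b) on labeled adaptation frames."""
    A, b = np.eye(DIM), np.zeros(DIM)   # start as a pass-through
    for _ in range(steps):
        xn, h, p = forward(x, A, b)
        d_logits = p.copy()
        d_logits[np.arange(len(y)), y] -= 1.0
        d_logits /= len(y)
        d_pre = (d_logits @ W2.T) * (1.0 - h ** 2)  # back through tanh
        d_xn = d_pre @ W1.T                         # back through frozen W1
        A -= lr * (x.T @ d_xn)
        b -= lr * d_xn.sum(axis=0)
    return A, b

# Synthetic adaptation data: labels come from the frozen MLP on "clean"
# frames; the new speaker's frames are a scaled and shifted copy of them.
x = rng.normal(size=(60, DIM))
y = forward(x, np.eye(DIM), np.zeros(DIM))[2].argmax(axis=1)
x_spk = 1.2 * x + 0.5

loss_before = cross_entropy(forward(x_spk, np.eye(DIM), np.zeros(DIM))[2], y)
A, b = adapt_transform(x_spk, y)
loss_after = cross_entropy(forward(x_spk, A, b)[2], y)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

In this sketch only `A` and `b` change, which corresponds to normalization. Adaptation in the abstract's sense would instead continue gradient updates on the MLP weights themselves (`W1` and `W2` here) using the same adaptation frames.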
Similar resources
Invariant integration features combined with speaker-adaptation methods
Speaker-normalization and -adaptation methods are essential components of state-of-the-art speech recognition systems nowadays. Recently, so-called invariant integration features were presented which are motivated by the theory of invariants. While it was shown that the integration features outperform MFCCs when used with a basic monophone recognition system, it was left open whether their benefits...
Connectionist Speaker Normalization and Its Applications to Speech Recognition
Speaker normalization may have a significant impact on both speaker-adaptive and speaker-independent speech recognition. In this paper, a codeword-dependent neural network (CDNN) is presented for speaker normalization. The network is used as a nonlinear mapping function to transform speech data between two speakers. The mapping function is characterized by two important properties. First, the ass...
D-MAP: a distance-normalized MAP estimation of speaker models for automatic speaker verification
In this paper we introduce a MAP estimation of speaker models in Automatic Speaker Verification with a distance constraint: the D-MAP adaptation. The D-MAP is based on the Kullback-Leibler distances and provides an easy way to automatically compute a speaker-dependent adaptation of the model parameters. We formulate a distance constrained MAP criterion and we show an equivalence between the D-M...
Vocal Tract Length Normalization for Large Vocabulary Continuous Speech Recognition
Generally speaking, the speaker-dependence of a speech recognition system stems from speaker-dependent speech features. The variation of vocal tract length and/or shape is one of the major sources of inter-speaker variations. In this paper, we address several methods of vocal tract length normalization (VTLN) for large vocabulary continuous speech recognition: (1) explore the bilinear warping VTL...
Automatic detection of the second subglottal resonance and its application to speaker normalization.
Speaker normalization typically focuses on inter-speaker variabilities of the supraglottal (vocal tract) resonances, which constitute a major cause of spectral mismatch. Recent studies have shown that the subglottal airways also affect spectral properties of speech sounds, and promising results were reported using the subglottal resonances for speaker normalization. This paper proposes a reliab...